Improving parameter learning of Bayesian nets from incomplete data

Authors

  • Giorgio Corani
  • Cassio Polpo de Campos
Abstract

This paper addresses the estimation of parameters of a Bayesian network from incomplete data. The task is usually tackled by running the Expectation-Maximization (EM) algorithm several times in order to obtain a high log-likelihood estimate. We argue that choosing the maximum log-likelihood estimate (as well as the maximum penalized log-likelihood and the maximum a posteriori estimate) has severe drawbacks, being affected both by overfitting and model uncertainty. Two ideas are discussed to overcome these issues: a maximum entropy approach and a Bayesian model averaging approach. Both ideas can be easily applied on top of EM, while the entropy idea can also be implemented in a more sophisticated way, through a dedicated non-linear solver. A vast set of experiments shows that these ideas produce significantly better estimates and inferences than the traditional and widely used maximum (penalized) log-likelihood and maximum a posteriori estimates. In particular, if EM is adopted as the optimization engine, the model averaging approach is the best performing one; its performance is matched by the entropy approach when implemented using the non-linear solver. The results suggest that the applicability of these ideas is immediate (they are easy to implement and to integrate in currently available inference engines) and that they constitute a better way to learn Bayesian network parameters.
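The difference between picking a single EM run and the two alternatives named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `runs` data, the tolerance `tol`, the use of summed conditional-distribution entropies as an entropy proxy, and the weights proportional to each run's likelihood. The paper's actual criteria and weighting may differ.

```python
import math

# Hypothetical results of three EM restarts (made-up numbers).
# "cpts" holds rows of conditional probabilities; "query" is the
# distribution that run's network returns for some inference.
runs = [
    {"loglik": -100.0, "cpts": [[0.99, 0.01]], "query": [0.9, 0.1]},
    {"loglik": -100.5, "cpts": [[0.6, 0.4]], "query": [0.6, 0.4]},
    {"loglik": -120.0, "cpts": [[0.5, 0.5]], "query": [0.5, 0.5]},
]

def entropy(rows):
    """Sum of Shannon entropies (in nats) over probability rows.
    Assumption: a simple proxy, not the network's joint entropy."""
    return -sum(p * math.log(p) for row in rows for p in row if p > 0)

def pick_max_loglik(runs):
    """Traditional choice: the run with the highest log-likelihood."""
    return max(runs, key=lambda r: r["loglik"])

def pick_max_entropy(runs, tol=1.0):
    """Among runs within `tol` of the best log-likelihood (assumed
    threshold), return the one with the highest entropy proxy."""
    best = max(r["loglik"] for r in runs)
    close = [r for r in runs if best - r["loglik"] <= tol]
    return max(close, key=lambda r: entropy(r["cpts"]))

def average_inference(runs, key="query"):
    """Average the runs' inferences, weighting each run by its
    likelihood relative to the best run (assumed weighting)."""
    best = max(r["loglik"] for r in runs)
    weights = [math.exp(r["loglik"] - best) for r in runs]
    total = sum(weights)
    size = len(runs[0][key])
    return [
        sum(w * r[key][i] for w, r in zip(weights, runs)) / total
        for i in range(size)
    ]

print(pick_max_loglik(runs)["query"])
print(pick_max_entropy(runs)["query"])
print(average_inference(runs))
```

In this toy data the two top runs have almost the same log-likelihood but quite different parameters, so the entropy-based choice and the averaged inference are both less extreme than the single maximum-likelihood run.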


Related resources

Bayesian Network Learning with Parameter Constraints

The task of learning models for many real-world problems requires incorporating domain knowledge into learning algorithms, to enable accurate learning from a realistic volume of training data. This paper considers a variety of types of domain knowledge for constraining parameter estimates when learning Bayesian Networks. In particular, we consider domain knowledge that constrains the values or ...


Scaling Bayesian Network Parameter Learning with Expectation Maximization using MapReduce

Bayesian network (BN) parameter learning from incomplete data can be a computationally expensive task. Applying the EM algorithm to learn BN parameters is unfortunately susceptible to local optima and prone to premature convergence. We develop and experiment with two methods for improving EM parameter learning by using MapReduce: Age-Layered Expectation Maximization (ALEM) a...


An Introduction to Inference and Learning in Bayesian Networks

Bayesian networks (BNs) are modern tools for modeling phenomena in dynamic and static systems and are used in different subjects such as disease diagnosis, weather forecasting, decision making and clustering. A BN is a graphical-probabilistic model which represents causal relations among random variables and consists of a directed acyclic graph and a set of conditional probabilities. Structure...


The threshold EM algorithm for parameter learning in Bayesian network with incomplete data

Bayesian networks (BN) are used in a wide range of applications, but they have one issue concerning parameter learning. In real applications, training data are always incomplete or some nodes are hidden. To deal with this problem, many parameter learning algorithms have been suggested, notably the EM, Gibbs sampling and RBE algorithms. In order to limit the search space and escape from local maxima produce...


Exploiting Parameter Domain Knowledge for Learning in Bayesian Networks

The task of learning models for many real-world problems requires researchers to incorporate problem Domain Knowledge into the learning algorithms because there is rarely enough training data to enable accurate learning of the structures and underlying relationships in the problem. Domain Knowledge comes in many forms. Domain Knowledge about relevance of variables (Feature Selection) can help u...



Journal:
  • CoRR

Volume abs/1110.3239

Publication date: 2011